image quality
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- North America > United States > California (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Asia > China > Hong Kong (0.05)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (2 more...)
- Health & Medicine (0.46)
- Information Technology (0.46)
AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection
Image Signal Processors (ISPs) convert raw sensor signals into digital images, which significantly influence the image quality and the performance of downstream computer vision tasks. Designing ISP pipeline and tuning ISP parameters are two key steps for building an imaging and vision system. To find optimal ISP configurations, recent works use deep neural networks as a proxy to search for ISP parameters or ISP pipelines. However, these methods are primarily designed to maximize the image quality, which are sub-optimal in the performance of high-level computer vision tasks such as detection, recognition, and tracking. Moreover, after training, the learned ISP pipelines are mostly fixed at the inference time, whose performance degrades in dynamic scenes. To jointly optimize ISP structures and parameters, we propose AdaptiveISP, a task-driven and scene-adaptive ISP. One key observation is that for the majority of input images, only a few processing modules are needed to improve the performance of downstream recognition tasks, and only a few inputs require more processing. Based on this, AdaptiveISP utilizes deep reinforcement learning to automatically generate an optimal ISP pipeline and the associated ISP parameters to maximize the detection performance.
Holistic Evaluation of Text-to-Image Models
The stunning qualitative improvement of text-to-image models has led to their widespread attention and adoption. However, we lack a comprehensive quantitative understanding of their capabilities and risks. To fill this gap, we introduce a new benchmark, Holistic Evaluation of Text-to-Image Models (HEIM). Whereas previous evaluations focus mostly on image-text alignment and image quality, we identify 12 aspects, including text-image alignment, image quality, aesthetics, originality, reasoning, knowledge, bias, toxicity, fairness, robustness, multilinguality, and efficiency. We curate 62 scenarios encompassing these aspects and evaluate 26 state-of-the-art text-to-image models on this benchmark. Our results reveal that no single model excels in all aspects, with different models demonstrating different strengths. We release the generated images and human evaluation results for full transparency at https://crfm.stanford.edu/heim/latest
SR-CACO-2: A Dataset for Confocal Fluorescence Microscopy Image Super-Resolution
Confocal fluorescence microscopy is one of the most accessible and widely used imaging techniques for the study of biological processes at the cellular and subcellular levels. Scanning confocal microscopy allows the capture of high-quality images from thick three-dimensional (3D) samples, yet suffers from well-known limitations such as photobleaching and phototoxicity of specimens caused by intense light exposure, which limits its use in some applications, especially for living cells. Cellular damage can be alleviated by changing imaging parameters to reduce light exposure, often at the expense of image quality.Machine/deep learning methods for single-image super-resolution (SISR) can be applied to restore image quality by upscaling lower-resolution (LR) images to produce high-resolution images (HR). These SISR methods have been successfully applied to photo-realistic images due partly to the abundance of publicly available datasets. In contrast, the lack of publicly available data partly limits their application and success in scanning confocal microscopy.In this paper, we introduce a large scanning confocal microscopy dataset named SR-CACO-2 that is comprised of low-and high-resolution image pairs marked for three different fluorescent markers. It allows to evaluate the performance of SISR methods on three different upscaling levels (X2, X4, X8).
Guiding a Diffusion Model with a Bad Version of Itself
The primary axes of interest in image-generating diffusion models are image quality, the amount of variation in the results, and how well the results align with a given condition, e.g., a class label or a text prompt. The popular classifier-free guidance approach uses an unconditional model to guide a conditional model, leading to simultaneously better prompt alignment and higher-quality images at the cost of reduced variation. These effects seem inherently entangled, and thus hard to control. We make the surprising observation that it is possible to obtain disentangled control over image quality without compromising the amount of variation by guiding generation using a smaller, less-trained version of the model itself rather than an unconditional model. This leads to significant improvements in ImageNet generation, setting record FIDs of 1.01 for 64x64 and 1.25 for 512x512, using publicly available networks. Furthermore, the method is also applicable to unconditional diffusion models, drastically improving their quality.
Meet Upscayl: This free AI tool turns blurry photos into high-detail images
When you purchase through links in our articles, we may earn a small commission. Upscayl uses AI technology to scale images to a larger format. The open source tool adds sharpness and brings details to light. There are several modes and models to choose from. The Upscayl tool uses several AI models to enhance your images.
- Leisure & Entertainment > Games > Computer Games (0.45)
- Information Technology > Security & Privacy (0.34)
Explainable Fundus Image Curation and Lesion Detection in Diabetic Retinopathy
Diabetic Retinopathy (DR) affects individuals with long-term diabetes. Without early diagnosis, DR can lead to vision loss. Fundus photography captures the structure of the retina along with abnormalities indicative of the stage of the disease. Artificial Intelligence (AI) can support clinicians in identifying these lesions, reducing manual workload, but models require high-quality annotated datasets. Due to the complexity of retinal structures, errors in image acquisition and lesion interpretation of manual annotators can occur. We proposed a quality-control framework, ensuring only high-standard data is used for evaluation and AI training. First, an explainable feature-based classifier is used to filter inadequate images. The features are extracted both using image processing and contrastive learning. Then, the images are enhanced and put subject to annotation, using deep-learning-based assistance. Lastly, the agreement between annotators calculated using derived formulas determines the usability of the annotations.
- Europe > Romania > Nord-Vest Development Region > Cluj County > Cluj-Napoca (0.04)
- Europe > Croatia > Zagreb County > Zagreb (0.04)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.92)
Evaluating and Preserving High-level Fidelity in Super-Resolution
Rocafort, Josep M., Su, Shaolin, Gomez-Villa, Alexandra, Vazquez-Corral, Javier
Recent image Super-Resolution (SR) models are achieving impressive effects in reconstructing details and delivering visually pleasant outputs. However, the overpowering generative ability can sometimes hallucinate and thus change the image content despite gaining high visual quality. This type of high-level change can be easily identified by humans yet not well-studied in existing low-level image quality metrics. In this paper, we establish the importance of measuring high-level fidelity for SR models as a complementary criterion to reveal the reliability of generative SR models. W e construct the first annotated dataset with fidelity scores from different SR models, and evaluate how state-of-the-art (SOTA) SR models actually perform in preserving high-level fidelity. Based on the dataset, we then analyze how existing image quality metrics correlate with fidelity measurement, and further show that this high-level task can be better addressed by foundation models. Finally, by fine-tuning SR models based on our fidelity feedback, we show that both semantic fidelity and perceptual quality can be improved, demonstrating the potential value of our proposed criteria, both in model evaluation and optimization. W e will release the dataset, code, and models upon acceptance.
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
SustainDiffusion: Optimising the Social and Environmental Sustainability of Stable Diffusion Models
d'Aloisio, Giordano, Fadahunsi, Tosin, Choy, Jay, Moussa, Rebecca, Sarro, Federica
Background: Text-to-image generation models are widely used across numerous domains. Among these models, Stable Diffusion (SD) - an open-source text-to-image generation model - has become the most popular, producing over 12 billion images annually. However, the widespread use of these models raises concerns regarding their social and environmental sustainability. Aims: To reduce the harm that SD models may have on society and the environment, we introduce SustainDiffusion, a search-based approach designed to enhance the social and environmental sustainability of SD models. Method: SustainDiffusion searches the optimal combination of hyperparameters and prompt structures that can reduce gender and ethnic bias in generated images while also lowering the energy consumption required for image generation. Importantly, SustainDiffusion maintains image quality comparable to that of the original SD model. Results: We conduct a comprehensive empirical evaluation of SustainDiffusion, testing it against six different baselines using 56 different prompts. Our results demonstrate that SustainDiffusion can reduce gender bias in SD3 by 68%, ethnic bias by 59%, and energy consumption (calculated as the sum of CPU and GPU energy) by 48%. Additionally, the outcomes produced by SustainDiffusion are consistent across multiple runs and can be generalised to various prompts. Conclusions: With SustainDiffusion, we demonstrate how enhancing the social and environmental sustainability of text-to-image generation models is possible without fine-tuning or changing the model's architecture.
- Europe > Italy > Abruzzo > L'Aquila Province > L'Aquila (0.76)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- (9 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study > Negative Result (0.46)